Our earlier research built a virtual shake robot in simulation to study the dynamics of precariously balanced rocks (PBRs), which are negative indicators of earthquakes in nature. The simulation studies need validation through physical experiments. For this purpose, we developed Shakebot, a low-cost (under $2,000), open-source shake table to validate simulations of PBR dynamics and facilitate other ground motion experiments. The Shakebot is a custom one-dimensional prismatic robotic system with perception and motion software developed using the Robot Operating System (ROS). We adapted affordable and high-accuracy components from 3D printers, particularly a closed-loop stepper motor for actuation and a toothed belt for transmission. The stepper motor enables the bed to reach a maximum horizontal acceleration of 11.8 m/s^2 (1.2 g) and a maximum velocity of 0.5 m/s when loaded with a 2 kg scale-model PBR. The perception system of the Shakebot consists of an accelerometer and a high frame-rate camera. By fusing camera-based displacements with acceleration measurements, the Shakebot achieves accurate bed-velocity estimation. The ROS-based perception and motion software simplifies the transition of code from our previous virtual shake robot to the physical Shakebot. The reuse of the control programs ensures that the implemented ground motions are consistent between the simulation and physical experiments, which is critical for validating our simulation experiments.
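The camera/accelerometer fusion for bed-velocity estimation can be sketched as a complementary filter: integrated acceleration is smooth but drifts, while differentiated camera displacement is drift-free but noisy, so blending the two yields a usable estimate. The following is a minimal illustration under assumed data and an assumed filter gain, not Shakebot's actual implementation.

```python
# Minimal complementary-filter sketch of accelerometer/camera fusion for
# bed-velocity estimation. ALPHA and the synthetic data are assumptions,
# not values from the Shakebot paper.

def fuse_velocity(accels, cam_positions, dt, alpha=0.9):
    """Blend integrated acceleration (smooth, drift-prone) with
    differentiated camera displacement (drift-free, noisy)."""
    v_est = 0.0
    out = []
    prev_x = cam_positions[0]
    for a, x in zip(accels, cam_positions):
        v_acc = v_est + a * dt          # integrate accelerometer sample
        v_cam = (x - prev_x) / dt       # differentiate camera displacement
        v_est = alpha * v_acc + (1 - alpha) * v_cam
        prev_x = x
        out.append(v_est)
    return out

# Constant 1 m/s^2 acceleration from rest: estimated velocity should
# track the true ramp v(t) = a * t.
dt = 0.01
n = 100
accels = [1.0] * n
positions = [0.5 * 1.0 * (i * dt) ** 2 for i in range(n)]
v = fuse_velocity(accels, positions, dt)
```

The gain `alpha` trades accelerometer drift against camera noise; in practice it would be tuned to the sensor noise characteristics.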
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
The role of mobile cameras has increased dramatically over the past few years, leading to more and more research in automatic image quality enhancement and RAW photo processing. In this Mobile AI challenge, the target was to develop an efficient end-to-end AI-based image signal processing (ISP) pipeline that replaces the standard mobile ISPs and can run on modern smartphone GPUs using TensorFlow Lite. The participants were provided with a large-scale Fujifilm UltraISP dataset consisting of thousands of paired photos captured with a normal mobile camera sensor and a professional 102MP medium-format Fujifilm GFX100 camera. The runtime of the resulting models was evaluated on the Snapdragon 8 Gen 1 GPU, which provides excellent acceleration results for the majority of common deep learning ops. The proposed solutions are compatible with all recent mobile GPUs and can process Full HD photos in 20-50 milliseconds while achieving high fidelity results. A detailed description of all models developed in this challenge is provided in this paper.
Finetuning language models on a collection of datasets phrased as instructions has been shown to improve model performance and generalization to unseen tasks. In this paper we explore instruction finetuning with a particular focus on (1) scaling the number of tasks, (2) scaling the model size, and (3) finetuning on chain-of-thought data. We find that instruction finetuning with the above aspects dramatically improves performance on a variety of model classes (PaLM, T5, U-PaLM), prompting setups (zero-shot, few-shot, CoT), and evaluation benchmarks (MMLU, BBH, TyDiQA, MGSM, open-ended generation). For instance, Flan-PaLM 540B instruction-finetuned on 1.8K tasks outperforms PaLM 540B by a large margin (+9.4% on average). Flan-PaLM 540B achieves state-of-the-art performance on several benchmarks, such as 75.2% on five-shot MMLU. We also publicly release Flan-T5 checkpoints, which achieve strong few-shot performance even compared to much larger models, such as PaLM 62B. Overall, instruction finetuning is a general method for improving the performance and usability of pretrained language models.
Dependency-aware job scheduling in clusters is NP-hard. Recent work shows that deep reinforcement learning (DRL) is capable of solving it. It is difficult for administrators to understand a DRL-based policy even when it achieves remarkable performance gains, so schedulers based on complex models do not easily earn the trust of systems that value simplicity. In this paper, we present a multi-level explanation framework to interpret the policies of DRL-based scheduling. We dissect its decision-making process into the job level and the task level, and approximate each level with interpretable models and rules in operational practice. We show that the framework provides system administrators with insights into the state-of-the-art scheduler and reveals robustness issues in its behavior patterns.
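One way to approximate a black-box scheduling policy with an interpretable rule is to fit a decision stump (a single threshold test) to the policy's observed decisions and measure how faithfully it reproduces them. The sketch below illustrates the idea with a hypothetical stand-in policy based on a load ratio; it is not the paper's framework or its DRL scheduler.

```python
# Conceptual sketch: distill a black-box scheduling policy into an
# interpretable threshold rule and measure fidelity. The "policy" here
# is a hypothetical stand-in, not a trained DRL scheduler.

def black_box_policy(pending_tasks, free_slots):
    # Hypothetical decision: admit a new job iff the cluster is not overloaded.
    return 1 if pending_tasks / max(free_slots, 1) < 2.0 else 0

def fit_stump(samples, labels):
    """Find the threshold t on a single feature that best matches the
    policy's decisions, i.e. the rule 'admit iff feature < t'."""
    best = (None, -1.0)
    for t in sorted(set(samples)):
        acc = sum((1 if s < t else 0) == y
                  for s, y in zip(samples, labels)) / len(samples)
        if acc > best[1]:
            best = (t, acc)
    return best

# Query the policy over a grid of cluster states.
states = [(p, f) for p in range(1, 20) for f in range(1, 10)]
ratios = [p / f for p, f in states]
actions = [black_box_policy(p, f) for p, f in states]
threshold, fidelity = fit_stump(ratios, actions)
```

A perfect fidelity score means the simple rule fully explains the policy on the sampled states; in practice a DRL policy would only be partially captured, and the residual disagreement is itself informative.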
Forecasts help businesses allocate resources and achieve goals. At LinkedIn, product owners use forecasts to set business targets, track progress, and monitor health. Engineers use forecasts to provision hardware efficiently. Developing a forecasting solution to meet these needs requires accurate, interpretable forecasts over a variety of time series at sub-daily to quarterly frequencies. We present Greykite, an open-source Python library for forecasting that has been deployed on over twenty use cases at LinkedIn. Its flagship algorithm, Silverkite, provides interpretable, fast, and highly flexible univariate forecasts that capture effects such as time-varying growth and seasonality, autocorrelation, holidays, and regressors. The library enables self-serve accuracy and trust by facilitating data exploration, model configuration, execution, and interpretation. Our benchmark results show good out-of-the-box speed and accuracy on datasets from a variety of domains. Over the past two years, Greykite forecasts have been trusted by finance, engineering, and product teams for resource planning and allocation, target setting and progress tracking, anomaly detection, and root-cause analysis. We expect Greykite to be useful to forecast practitioners with similar applications who need accurate, interpretable forecasts that capture the complex dynamics common to time series related to human activity.
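The interpretability of an additive forecast comes from decomposing the series into named components (trend, seasonality, holidays, regressors) that can be inspected separately. The following is a stdlib-only sketch of a two-component decomposition (linear trend plus weekly seasonality) on synthetic data; it illustrates the general idea, not Greykite's API or the Silverkite algorithm.

```python
# Toy additive decomposition: linear trend + weekly seasonality, fit on
# synthetic daily data. Illustrates the interpretable-components idea only;
# this is NOT Greykite's API.

def fit_trend_plus_weekly(y):
    n = len(y)
    t = list(range(n))
    # Closed-form OLS for the linear trend y ~ a + b*t.
    tbar, ybar = sum(t) / n, sum(y) / n
    b = (sum((ti - tbar) * (yi - ybar) for ti, yi in zip(t, y))
         / sum((ti - tbar) ** 2 for ti in t))
    a = ybar - b * tbar
    # Weekly seasonality: mean residual per day-of-week.
    resid = [yi - (a + b * ti) for ti, yi in zip(t, y)]
    seasonal = [sum(resid[d::7]) / len(resid[d::7]) for d in range(7)]
    return a, b, seasonal

def forecast(a, b, seasonal, t):
    return a + b * t + seasonal[t % 7]

# Synthetic daily series: growth of 0.5/day plus a +3.0 weekend bump.
y = [10 + 0.5 * t + (3.0 if t % 7 in (5, 6) else 0.0) for t in range(70)]
a, b, seasonal = fit_trend_plus_weekly(y)
pred = forecast(a, b, seasonal, 70)   # one step ahead
```

Each fitted component is directly readable: `b` is the daily growth rate and `seasonal` shows the weekend effect, which is the kind of transparency that builds the trust the abstract describes.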
In nuclear imaging, limited resolution causes partial volume effects (PVEs) that degrade image sharpness and quantitative accuracy. Partial volume correction (PVC) using high-resolution anatomical information from CT or MRI has been shown to be effective. However, such anatomy-guided methods typically require tedious image registration and segmentation steps. Accurately segmented organ templates are also hard to obtain, particularly in cardiac SPECT imaging, due to the lack of hybrid SPECT/CT scanners with high-end CT and the associated motion artifacts. Slight mis-registration or mis-segmentation leads to severe degradation of image quality after PVC. In this work, we developed a deep-learning-based method for fast cardiac SPECT PVC without anatomical information and the associated organ segmentation. The proposed network involves a densely-connected multi-dimensional dynamic mechanism, which allows the convolutional kernels to be adapted to the input images even after the network is fully trained. Intramyocardial blood volume (IMBV) is introduced as an additional clinical loss function for network optimization. The proposed network demonstrated promising performance on 28 canine studies acquired on a GE Discovery NM/CT 570c dedicated cardiac SPECT scanner with Technetium-99m-labeled red blood cells. This work showed that the proposed network with the densely-connected dynamic mechanism produced superior results compared with the same network without the mechanism. The results also showed that the proposed network without anatomical information can produce images with statistically comparable IMBV measurements to those produced by anatomy-guided PVC methods, which could be helpful for clinical translation.
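The core idea of a dynamic convolution is that the effective kernel is computed from the input itself, so the filter adapts per image even with frozen weights. Below is a deliberately simplified 1D illustration: a global summary of the input selects a softmax-weighted mix of fixed basis kernels. It is a sketch of the general dynamic-convolution concept, not the paper's densely-connected multi-dimensional mechanism.

```python
import math

# Toy input-dependent ("dynamic") convolution: the kernel is a
# softmax-weighted mix of basis kernels, with the mix computed from the
# input. Basis kernels and attention parameters are toy assumptions.

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def dynamic_conv1d(signal, bases, attn_w):
    # Mixing weights depend on a global summary of the input signal.
    summary = sum(signal) / len(signal)
    weights = softmax([w * summary for w in attn_w])
    # Aggregate the basis kernels into one input-conditioned kernel.
    k = [sum(w * b[i] for w, b in zip(weights, bases))
         for i in range(len(bases[0]))]
    half = len(k) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, kj in enumerate(k):
            idx = i + j - half
            if 0 <= idx < len(signal):
                acc += kj * signal[idx]
        out.append(acc)
    return out, weights

bases = [[0.25, 0.5, 0.25],    # smoothing kernel
         [-1.0, 2.0, -1.0]]    # sharpening (Laplacian-like) kernel
attn_w = [1.0, -1.0]           # toy attention parameters
out, weights = dynamic_conv1d([1.0] * 8, bases, attn_w)
```

In the paper's setting the adaptation is learned and multi-dimensional, but the mechanism shown here, conditioning the kernel on the input at inference time, is the same in spirit.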
Single-photon emission computed tomography (SPECT) is a widely applied imaging approach for diagnosing coronary artery disease. Attenuation maps (u-maps) derived from computed tomography (CT) are used for attenuation correction (AC) to improve the diagnostic accuracy of cardiac SPECT. However, SPECT and CT are acquired sequentially in clinical practice, which potentially induces misregistration between the two scans. Convolutional neural networks (CNNs) are powerful tools for medical image registration. Previous CNN-based cross-modality registration methods either directly concatenated the two input modalities as early feature fusion, or fused image features extracted by two separate CNN modules as late fusion. These methods do not fully extract or fuse the cross-modality information. Besides, deep-learning-based rigid registration of cardiac SPECT and CT-derived u-maps has not been investigated before. In this paper, we propose a Dual-Branch Squeeze-Fusion-Excitation (DuSFE) module for the registration of cardiac SPECT and CT-derived u-maps. DuSFE fuses the knowledge from multiple modalities to recalibrate both the channel-wise and spatial features of each modality. DuSFE can be embedded at multiple convolutional layers to enable feature fusion at different spatial dimensions. Our studies using clinical data showed that a network embedded with DuSFE generated lower registration errors, and therefore more accurate AC SPECT images, than previous methods.
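Squeeze-and-excitation style recalibration works by pooling each channel to a scalar descriptor ("squeeze"), computing per-channel gates from the descriptors ("excitation"), and rescaling the feature maps by those gates. The sketch below shows the cross-modality variant of this idea with toy fixed weights; it is a conceptual illustration of the mechanism, not the DuSFE module itself (which also recalibrates spatial features and uses learned parameters).

```python
import math

# Conceptual sketch of squeeze-and-excitation style cross-modality channel
# recalibration. Weights and feature maps are toy values, not learned
# parameters or clinical data.

def squeeze(feat):
    """Global average pool: one scalar descriptor per channel."""
    return [sum(ch) / len(ch) for ch in feat]

def excite(desc_a, desc_b, w):
    """Gate one modality's channels using the joint descriptor of both."""
    joint = desc_a + desc_b           # concatenate the two descriptors
    gates = []
    for row in w:                     # one weight row per output channel
        z = sum(wi * di for wi, di in zip(row, joint))
        gates.append(1.0 / (1.0 + math.exp(-z)))   # sigmoid gate
    return gates

def recalibrate(feat, gates):
    return [[g * v for v in ch] for ch, g in zip(feat, gates)]

# Two toy "modalities", each with 2 channels x 4 pixels.
spect = [[1.0, 2.0, 3.0, 2.0], [0.5, 0.5, 0.5, 0.5]]
umap  = [[2.0, 2.0, 2.0, 2.0], [1.0, 0.0, 1.0, 0.0]]
w = [[0.5, 0.0, 0.0, 0.0], [0.0, 0.5, 0.0, 0.0]]   # toy gating weights
gates = excite(squeeze(spect), squeeze(umap), w)
spect_recal = recalibrate(spect, gates)
```

Because the gates are computed from both modalities jointly, each branch's features are reweighted using information from the other, which is the cross-modality fusion the abstract describes.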
Increasing model size when pretraining natural language representations often results in improved performance on downstream tasks. However, at some point further model increases become harder due to GPU/TPU memory limitations and longer training times. To address these problems, we present two parameter-reduction techniques to lower memory consumption and increase the training speed of BERT (Devlin et al., 2019). Comprehensive empirical evidence shows that our proposed methods lead to models that scale much better compared to the original BERT. We also use a self-supervised loss that focuses on modeling inter-sentence coherence, and show it consistently helps downstream tasks with multi-sentence inputs. As a result, our best model establishes new state-of-the-art results on the GLUE, RACE, and SQuAD benchmarks while having fewer parameters compared to BERT-large. The code and the pretrained models are available at https://github.com/google-research/ALBERT. * Work done as an intern at Google Research, driving data processing and downstream task evaluations.
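One of ALBERT's two parameter-reduction techniques is factorized embedding parameterization: the V x H vocabulary embedding is replaced by a V x E matrix followed by an E x H projection, which shrinks parameters dramatically whenever E << H. The back-of-envelope calculation below uses representative sizes, not ALBERT's exact configuration.

```python
# Back-of-envelope parameter count for ALBERT-style factorized embeddings.
# V, H, E below are representative sizes, not the exact ALBERT config.

V, H, E = 30000, 4096, 128        # vocab size, hidden size, embedding size

dense_params = V * H              # direct V x H embedding matrix
factored_params = V * E + E * H   # V x E lookup plus E x H projection

reduction = dense_params / factored_params   # roughly 28x fewer parameters
```

The other technique, cross-layer parameter sharing, reuses one set of Transformer-layer weights across all layers, so layer count no longer multiplies the parameter count.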
A step-search sequential quadratic programming method is proposed for solving nonlinear equality constrained stochastic optimization problems. It is assumed that constraint function values and derivatives are available, but only stochastic approximations of the objective function and its associated derivatives can be computed via inexact probabilistic zeroth- and first-order oracles. Under reasonable assumptions, a high-probability bound on the iteration complexity of the algorithm to approximate first-order stationarity is derived. Numerical results on standard nonlinear optimization test problems illustrate the advantages and limitations of our proposed method.
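The basic SQP iteration behind such methods solves a quadratic subproblem whose KKT system combines the (stochastic) objective gradient with the exact constraint Jacobian. The toy sketch below runs that iteration on a 2D equality-constrained problem with a Gaussian-noise gradient oracle; it uses an identity Hessian and a fixed step size, deliberately omitting the paper's step-search and probabilistic-oracle machinery.

```python
import random

# Toy stochastic SQP iteration: minimize E[f(x)] = x1^2 + x2^2
# subject to c(x) = x1 + x2 - 1 = 0, with a noisy gradient oracle.
# Identity Hessian and fixed step size are simplifying assumptions;
# the paper's method adapts the step via a step search.

random.seed(0)

def noisy_grad(x, sigma=0.01):
    # Stochastic first-order oracle: true gradient plus Gaussian noise.
    return [2 * x[0] + random.gauss(0, sigma),
            2 * x[1] + random.gauss(0, sigma)]

def constraint(x):
    return x[0] + x[1] - 1.0

x = [3.0, -1.0]
for k in range(50):
    g = noisy_grad(x)
    c = constraint(x)
    # KKT system with H = I and A = [1, 1]:
    #   d + A^T lam = -g,   A d = -c   =>   lam = (c - A g) / (A A^T)
    lam = (c - (g[0] + g[1])) / 2.0
    d = [-g[0] - lam, -g[1] - lam]
    alpha = 0.5                      # fixed step; a step search would adapt this
    x = [x[0] + alpha * d[0], x[1] + alpha * d[1]]
```

The iterates approach the true solution (0.5, 0.5) up to noise-level error, which mirrors the high-probability (rather than deterministic) guarantees the abstract describes.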